Predicting Neighbor Distribution in Heterogeneous Information Networks | Proceedings of the 2015 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics
Abstract
Recently, considerable attention has been devoted to prediction problems arising in heterogeneous information networks. In this paper, we present a new prediction task, Neighbor Distribution Prediction (NDP), which aims to predict the distribution of labels over the neighbors of a given node and is valuable for many applications in heterogeneous information networks. The challenges of NDP come mainly from three aspects: the infinite state space of a neighbor distribution, the sparsity of available data, and the difficulty of evaluating predictions fairly. To address these challenges, we first propose an Evolution Factor Model (EFM) for NDP, which uses two new structures proposed in this paper: the Neighbor Distribution Vector (NDV), which represents the state of a given node’s neighbors, and the Neighbor Label Evolution Matrix (NLEM), which captures the dynamics of a neighbor distribution. We further propose a learning algorithm for EFM. To overcome data sparsity, the learning algorithm first clusters all the nodes and learns an NLEM for each cluster instead of for each node. To evaluate predictions fairly, we propose a new metric, Virtual Accuracy (VA), which takes into account both the absolute accuracy and the predictability of a node. Extensive experiments on three real datasets from different domains validate the effectiveness of the proposed model EFM and the metric VA.
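As a rough illustration of the two structures named in the abstract, the sketch below treats a neighbor distribution vector as the fraction of a node's neighbors carrying each label, and an evolution matrix as a row-stochastic matrix that maps the current vector to a predicted one. Both readings are assumptions made for illustration; the abstract does not give the paper's exact definitions, and the function names `neighbor_distribution` and `evolve` are hypothetical.

```python
from collections import Counter


def neighbor_distribution(neighbor_labels, label_set):
    """Fraction of neighbors carrying each label, in label_set order.

    Assumed reading of a "neighbor distribution vector"; not taken
    from the paper itself.
    """
    counts = Counter(neighbor_labels)
    total = sum(counts[label] for label in label_set)
    if total == 0:
        # A node with no labeled neighbors gets an all-zero vector.
        return [0.0] * len(label_set)
    return [counts[label] / total for label in label_set]


def evolve(ndv, matrix):
    """Predict the next vector as ndv * matrix (row vector times matrix).

    Assumes matrix[i][j] is the share of label-i mass that moves to
    label j in one time step; this is an illustrative stand-in for an
    evolution matrix, not the paper's learning algorithm.
    """
    n = len(ndv)
    return [sum(ndv[i] * matrix[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    labels = ["a", "b", "c"]
    print(neighbor_distribution(["a", "a", "b", "c"], labels))  # [0.5, 0.25, 0.25]
    print(evolve([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]]))  # [0.5, 0.5]
```

Under the clustering step described in the abstract, one such matrix would be shared by all nodes in a cluster rather than estimated per node, which is how the approach copes with sparse per-node histories.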
Similar resources
Graph Regularized Meta-path Based Transductive Regression in Heterogeneous Information Network | Proceedings of the 2015 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics
A number of real-world networks are heterogeneous information networks, which are composed of different types of nodes and links. Numerical prediction in heterogeneous information networks is a challenging but significant area because network-based information for unlabeled objects is usually too limited to make precise estimates. In this paper, we consider a graph regularized meta-path based tra...
Clustering and Ranking in Heterogeneous Information Networks via Gamma-Poisson Model | Proceedings of the 2015 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics
Clustering and ranking have been successfully applied independently to homogeneous information networks, containing only one type of objects. However, real-world information networks are oftentimes heterogeneous, containing multiple types of objects and links. Recent research has shown that clustering and ranking can actually mutually enhance each other, and several techniques have been develop...
Citation Prediction in Heterogeneous Bibliographic Networks | Proceedings of the 2012 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics
To reveal information hidden in the link space of bibliographic networks, link analysis has been studied from different perspectives in recent years. In this paper, we address a novel problem, namely citation prediction: given information about the authors, topics, target publication venues, and time of a research paper, finding and predicting the citation relationship between a q...
On Dynamic Link Inference in Heterogeneous Networks | Proceedings of the 2012 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics
Network and linked data have become quite prevalent in recent years because of the ubiquity of the web and social media applications, which are inherently network oriented. Such networks are massive, dynamic, contain a lot of content, and may evolve over time in terms of the underlying structure. In this paper, we will study the problem of dynamic link inference in temporal and heterogeneous in...
Publication date: 2015